Biostatistics For Dummies (Monika Wahi John Pezzullo)

positive differences would cancel each other out. You want all differences (positive and negative) to

contribute to the overall measure of how far your observations are from what you expected under the null hypothesis.

Instead of summing the differences, statisticians prefer to sum the squares of differences, because the

squares are always positive. This is exactly what’s done in the chi-square test. Figure 12-5 shows the

squared scaled differences, which are calculated from the observed and expected counts in Figures

12-1 and 12-2 using the formula $(\text{observed} - \text{expected})^2 / \text{expected}$ (rather than by squaring the rounded-off numbers in

Figure 12-4, which would be less accurate).

© John Wiley & Sons, Inc.

FIGURE 12-5: Components of the chi-square statistic: squares of the scaled differences.
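To make the rounding point concrete, here is a minimal Python sketch of both routes to a squared scaled difference. The counts are hypothetical stand-ins (they are not the values from Figures 12-1 and 12-2), and the function names are made up for illustration. The sketch assumes the scaled difference is the observed count minus the expected count, divided by the square root of the expected count.

```python
# Hypothetical observed and expected counts for a 2x2 table.
# These are NOT the values from Figures 12-1 and 12-2.
observed = [[30.0, 20.0], [18.0, 32.0]]
expected = [[24.0, 26.0], [24.0, 26.0]]


def squared_scaled_differences(observed, expected):
    """Return (observed - expected)^2 / expected for every cell."""
    return [
        [(ob - ex) ** 2 / ex for ob, ex in zip(ob_row, ex_row)]
        for ob_row, ex_row in zip(observed, expected)
    ]


def squared_rounded_scaled_differences(observed, expected, digits=2):
    """Less accurate route: round each scaled difference first, then square it."""
    return [
        [round((ob - ex) / ex ** 0.5, digits) ** 2 for ob, ex in zip(ob_row, ex_row)]
        for ob_row, ex_row in zip(observed, expected)
    ]


exact = squared_scaled_differences(observed, expected)
rounded = squared_rounded_scaled_differences(observed, expected)

# Compare the two routes cell by cell.
for exact_row, rounded_row in zip(exact, rounded):
    print(exact_row, rounded_row)
```

With these made-up counts, the first cell comes out as 1.5 when calculated directly, but about 1.49 when the scaled difference is rounded to two decimals before squaring. The gap is small in each cell, but it carries into the sum.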

You then add up these squared scaled differences to get the chi-square test statistic (8.81 in this example). This sum is an excellent test statistic to measure the overall

departure of your data from the null hypothesis:

If the null hypothesis is true (use of CBD or NSAID does not impact pain relief status), this

statistic should be quite small.

If one of the levels of treatment has a disproportionate association with the outcome (in either

direction), it will affect the whole table, and the result will be a larger test statistic.
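The contrast between a small and a large test statistic can be sketched in a few lines of Python. Both tables below are hypothetical (rows are two treatment levels, columns are pain relief yes and no), and the function names are invented for this illustration. The sketch assumes the expected counts come from the usual row total times column total divided by grand total calculation.

```python
def expected_counts(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(observed)
    return sum(
        (ob - ex) ** 2 / ex
        for ob_row, ex_row in zip(observed, expected)
        for ob, ex in zip(ob_row, ex_row)
    )


# Hypothetical data: pain relief rates are nearly identical across treatments.
no_association = [[20, 30], [21, 29]]

# Hypothetical data: one treatment has a much higher pain relief rate.
strong_association = [[35, 15], [20, 30]]

small_stat = chi_square_statistic(no_association)
large_stat = chi_square_statistic(strong_association)

print(round(small_stat, 2))  # about 0.04
print(round(large_stat, 2))  # about 9.09
```

Notice that changing one row also shifts the column totals, so the expected counts (and the squared scaled differences) change in every cell of the table, not just in the row that differs.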

Determining the p value

Now that you have calculated the test statistic, the only remaining task before interpretation is to determine

the p value. The p value represents the probability that random fluctuations alone, in the absence of

any true effect of CBD or NSAIDs on pain relief, could lead to a value of 8.81 or greater for this test

statistic. (We introduce p values in Chapter 3.) Once again, the rigorous proof is very complicated, so

we present an informal explanation:

When the expected cell counts are very large, the Poisson distribution becomes very close to a normal

distribution (see Chapter 24 for more on the Poisson distribution). If the null hypothesis is true, each scaled

difference should be an approximately normally distributed random variable with a mean of zero and a

standard deviation of 1. The mean is zero because you subtract the expected value from the observed

value, and the standard deviation is 1 because the difference is divided by its SE. The sum of the squares of one or more normally distributed random numbers, each with a mean of zero and a standard deviation of 1, is a number that follows the chi-square distribution (also